The database engine that has always shipped with REALbasic is being replaced with a new database engine, also known as the REAL Database (the old engine will hereafter be known as "REAL Database (Old Format)". This document describes the features and usage of the new REAL database engine.
General Features
A new REAL database can store its data on disk in either of two ways. It can keep them all neatly encapsulated inside a virtual volume file (a single file on disk which can be treated like a virtual disk in REALbasic), or it can store them as real files in a normal folder on disk.
New REAL data tables may be added, dropped, or mutated in various ways (columns or indexes added or dropped), without losing or recopying the existing data.
The new REAL database supports a subset of SQL/92 and SQL/99 (details below), including queries that involve self-joins, aggregate functions, and more. Our intent is that for the set of features the new REAL Database engine supports, its syntax is fully SQL compliant (with a few minor extensions, like the Boolean data type). It also returns standard SQL error codes.
Every new REAL database table has a special column called "_rowid" which is a unique identifier for that row. This is added automatically, and serves as a convenient join field for building relational databases.
An SQLSelect returns a dynamic cursor; you can move forwards, backwards, or jump to the beginning or end as much as you like.
The new REAL database engine supports transactions, both for schema changes and for data changes. Per the SQL standard, a transaction is started automatically when you make any change to the database, and ended by COMMIT or ROLLBACK.
The new REAL database also supports encodings. When you insert records into the database, the encoding of the text fields (which is to say, the encoding of the SQL statement that inserts them) is remembered by the database. When you retrieve this data later, the encoding is restored. So, unlike most other database engines, there is no need to use DefineEncoding on text data retrieved from a REAL database.
Backwards Compatibility Notes
The new REAL database does not support column constraints; neither NOT NULL nor PRIMARY KEY will work. However, you can now index any column, whereas the old one did not support indexes except for the primary key.
The new database does not support the use of multi-word column names enclosed in square brackets. This feature may return if there is sufficient demand, but for now, your column names should not include any whitespace. (You can use the "AS" syntax to name result columns in your query.)
The old database engine supported both single-quotes and double-quotes as string delimiters; per the SQL standard, the new engine supports only single-quotes.
The new database engine does not support the MacPICT column type. It does, however, support a Binary (blob) column type.
There are many other minor differences, as we have moved to be more in line with the SQL standards. But most of the differences are new features which will not affect older projects. Those new features are briefly described above, and detailed below. In addition, there were many undocumented limitations of the old engine (e.g., ORDER BY did not work for queries involving a join) which do not exist in the new REAL database.
Data Types Supported
New REAL database fields may be of any of the following types:
Type Numeric Code Comments
---- ------------ --------
Integer 3 32-bit signed integer
VarChar 5 text up to 2^31 bytes (see Note 1)
Double 7 64-bit floating-point number
Date 8 day, in YYYY-MM-DD format
Time 9 time, in HH:MM:SS format
TimeStamp 10 time stamp, in YYYY-MMM-DD HH:MM:SS format
Boolean 12 boolean value, TRUE or FALSE (see Note 2)
Binary 14 binary data up to 2^31 bytes
String 18 text up to 2^31 bytes (see Note 1)
Note 1: Text fields in the new REAL database store encoding information as well as the text bytes.
Note 2: Boolean is not a SQL data type, and TRUE or FALSE are not SQL keywords. These are our own extensions to SQL.
Creating or Opening a Database in the IDE
If you want your new REAL database to be part of your project, then you can use the "Add Data Source" command in the File menu of the IDE to either create a new database file, or attach an existing one. When you make a new one, it is stored next to your project file, and when you ship your built app. When you add an existing one, the database file must already bee in the same folder as your project file (or built app). [In the future, we may search subdirectories of that folder as well.]
Creating or Opening a Database in REALbasic Code
You can also choose to create or open a new REAL database at run-time. You would do this by creating an instance of the REALDatabase class, assigning an appropriate location to its databaseFile property, and then calling either CreateDatabaseFile (to make a new database) or Connect (to open an existing database). Note that if the databaseFile refers to an already-existing folder, the data files will be stored within that folder; otherwise a virtual volume will be created and the files stored therein.
SQL Syntax Supported
A semi-formal specification of the SQL syntax supported by the new REAL database engine appears below. It uses the following conventions: a vertical bar ("|") separates items in a list of alternatives; square brackets ("[" and "]") indicate an optional part which may be included or omitted; and curly braces ("{" and "}") indicate a required part which may not be omitted. Items in angle brackets (e.g. "<select-item-list>") indicate a more complex part which may be detailed in its own specification entry. Any such item which ends in "-string" is a simple identifier (e.g. "foo"). Examples are presented in the next section.
valid-sql-input:
{ <select-statement>
| <insert-statement>
| <update-statement>
| <delete-statement>
| <create-table-statement>
| <create-index-statement>
| <alter-table-statement>
| <drop-table-statement>
| <drop-index-statement>
| COMMIT
| ROLLBACK
}
select-statement:
SELECT [ ALL | DISTINCT ] <select-item-list>
FROM <table-list>
[ WHERE <expression> ]
[ GROUP BY <column-ref-list> ]
[ HAVING <expression> ]
select-item-list: one or more comma-separated items of the form:
<column-expression> [ [ AS ] <column-name-string> ]
table-list: one or more comma-separated items of the form:
<table-name-string> [ [ AS ] <range-variable-string> ]
insert-statement:
INSERT INTO <table-name-string> [ ( <column-name-list> ) ]
{ DEFAULT VALUES
| VALUES ( <expression-list> )
| <select-statement>
}
update-statement:
UPDATE <table-name-string> SET <assignment-list>
[ WHERE <expression> ]
assignment-list: one or more comma-separated items of the form:
| <expression> [ NOT ] LIKE <expression> [ ESCAPE <expression> ]
| <expression> { + | - } <expression>
| <expression> { * | / } <expression>
| -<expression>
| ( <expression> )
| <function>
| { <number> | <string-literal> | TRUE | FALSE }
| DATE <date-string>
| TIME <time-string>
| TIMESTAMP <timestamp-string>
| <column-reference>
}
column-reference:
{ <column-name-string>
| <range-variable-string>.<column-name-string>
| *
| <range-variable-string>.*
}
function:
{ BIT_LENGTH ( <expression> )
| OCTET_LENGTH ( <expression> )
| CHAR_LENGTH ( <expression> )
| CURRENT_DATE
| CURRENT_TIME
| CURRENT_DATETIME
| LOWER ( <expression> )
| UPPER ( <expression> )
| POSITION ( <expression> IN <expression> )
| SUBSTRING ( <expression> FROM <expression> [ FOR <expression> ] )
| TRIM ( [ LEADING | TRAILING | BOTH ]
[ <expression> ] [ FROM ] <expression> )
| AVG ( [ DISTINCT | ALL ] expression> )
| MAX ( [ DISTINCT | ALL ] <expression> )
| MIN ( [ DISTINCT | ALL ] <expression> )
| SUM ( [ DISTINCT | ALL ] <expression> )
| COUNT ( [ DISTINCT | ALL ] <expression> )
| COUNT (*)
| LAST_ROWID ( 'table-name-string' )
}
Note that in ALTER TABLE and DROP TABLE, there is no difference between RESTRICT and CASCADE. In the TRIM function, the FROM keyword is required if either of the two preceeding parts (LEADING/TRAILING/BOTH and/or the pad character) are specified.
Result Codes
The new REAL database returns an error code as well as an error message. In the error message, the first five characters are the SQL standard error code, as follows:
00000 -- Success
02000 -- No records found.
01004 -- Warning: String truncated on right.
01S09 -- Warning: Invalid SQL keyword.
07000 -- Error: Dynamic SQL error.
42S01 -- Error: Table already exists.
42S02 -- Error: Table not found.
42S11 -- Error: Index already exists.
42S12 -- Error: Index not found.
42S21 -- Error: Column already exists.
42S22 -- Error: Column not found.
21S01 -- Error: Insert value list does not match column list.
22000 -- Error: Data exception.
22023 -- Error: Invalid parameter.
25000 -- Error: Invalid transaction state.
0A000 -- Error: Unsupported SQL feature.
72000 -- Error: Internal SQL processor error.
SQL Examples
The following examples assume a database with two tables: Movies (containing Title, Director, and Year), and Actors (containing Name and Movie, where Movie holds the _rowid value of the corresponding row in the Movies table). For each example, we present a plain English description, the SQL syntax, the result code, and the values returned by the query (where applicable, and assuming a handful of pre-existing data).
EXAMPLE: Prove that SQL can do math.
SQL: select 2+3+2*3
RESULT: 00000
11
EXAMPLE: List all fields for all movies in the database.
SQL: select * from Movies
RESULT: 00000
1,Star Wars,George Lucas,1977
2,Raiders of the Lost Ark,George Lucas,1981
3,American Graffiti,George Lucas,1973
4,Apollo 13,Ron Howard,1995
5,Cast Away,Robert Zemeckis,2000
EXAMPLE: List distinct actors (i.e., don't list any actor more than once).
SQL: select distinct Name from Actors
RESULT: 00000
Harrison Ford
Mark Hamill
Carrie Fisher
Ron Howard
Richard Dreyfuss
Tom Hanks
EXAMPLE: Find all actors who were in Star Wars.
SQL: select a.name from Actors A, Movies M where a.movie=m._rowid and m.title='Star Wars'
RESULT: 00000
Harrison Ford
Mark Hamill
Carrie Fisher
EXAMPLE: Find actors, with movie title and year, for all movies directed by George Lucas.
SQL: select A.Name,M.Title,M.Year from Actors A, Movies M where A.movie=M._rowid and M.Director='George Lucas'
RESULT: 00000
Ron Howard,American Graffiti,1973
Richard Dreyfuss,American Graffiti,1973
Harrison Ford,Raiders of the Lost Ark,1981
Harrison Ford,Star Wars,1977
Mark Hamill,Star Wars,1977
Carrie Fisher,Star Wars,1977
EXAMPLE: See if Joe has been in any movies.
SQL: select Name from Actors where Name='Joe Strout'
RESULT: 02000
EXAMPLE: Find all actors who have done a movie with Harrison Ford.
SQL: select A.name from Actors A, Actors B where A.movie = B.movie and B.name='Harrison Ford' and A.name<>'Harrison Ford'
RESULT: 00000
Mark Hamill
Carrie Fisher
EXAMPLE: Find anyone who is both an actor and a director.
SQL: select A.Name from Actors A, Movies M where A.Name = M.Director
RESULT: 00000
Ron Howard
EXAMPLE: Add a new record to the Movies table.
SQL: insert into Movies (title, director, year) values ('Joe''s Home Movie', 'Joe Strout', 2003)
RESULT: 00000
EXAMPLE: Verify that the record was added correctly.
SQL: select director, title from Movies where title = 'Joe''s Home Movie'
RESULT: 00000
Joe Strout,Joe's Home Movie
EXAMPLE: Delete the record we just added.
SQL: delete from Movies where director='Joe Strout'
RESULT: 00000
EXAMPLE: Attempt a table alteration while there's a transaction in progress.
SQL: alter table Movies add column Rating integer
RESULT: 25000
EXAMPLE: Find any movies with the word "War" in the title (ignoring case).
SQL: select title, year from Movies where Upper(title) like '%WAR%'
RESULT: 00000
Star Wars,1977
EXAMPLE: Find the director of any move not made in the 1900s, and count the characters in his name.
SQL: select director, char_length(director) from Movies where year not like '19__'
RESULT: 00000
Robert Zemeckis,15
EXAMPLE: Find out when each director made his first and last movie (in our database).
SQL: select Director, Min(Year), Max(Year) from Movies group by Director
RESULT: 00000
George Lucas,1973,1981
Robert Zemeckis,2000,2000
Ron Howard,1995,1995
EXAMPLE: Find out how many actors we have for each movie (by title).
SQL: select A.Title, Count(B.Name) from Movies A, Actors B where B.Movie = A._rowID group by B.Movie
RESULT: 00000
Star Wars,3
Raiders of the Lost Ark,1
American Graffiti,2
Apollo 13,1
Cast Away,1
EXAMPLE: Find how many movies each actor has done, but display only those who have done more than one.
SQL: select Name, Count(*) from Actors group by Name having Count(*) > 1
Known Limitations
No table can be greater than 2 GB total size. When storing files in a virtual volume, the entire database must also be under 2 GB. Row IDs increment automatically and are never reused; so you can't insert more than 2^32 (about 4 billion) records into any table. [We intend to remove that limitation.]